TaRKD2 RNAseq quality control
All samples were downloaded from Novogene to the server, then trimmed and quality-controlled in a single step. All samples passed quality checks and were pseudoaligned with kallisto to the IWGSC RefSeq v1.1 assembly, with pseudoalignment rates of 72.3-79.3% (Table 1B). Following this, we imported the raw counts into R and analysed them using DESeq2.
Fastp quality control report
| Sample name | Duplication | Q30 reads (M) | Q30 bases (Mb) | GC content | Passed filter | Adapter |
|---|---|---|---|---|---|---|
| YL100_1 | 22.60% | 0 | 7811.0 | 57.30% | 98.20% | 1.90% |
| YL101_1 | 24.30% | 0 | 5804.1 | 56.70% | 98.90% | 5.60% |
| YL103_1 | 24.60% | 0 | 5422.7 | 56.30% | 98.90% | 5.70% |
| YL104_1 | 25.20% | 0 | 6669.9 | 56.40% | 99.00% | 6.20% |
| YL105_1 | 25.10% | 0 | 5598.0 | 56.70% | 99.00% | 4.70% |
| YL107_1 | 28.80% | 0 | 7499.0 | 56.20% | 99.00% | 5.40% |
| YL117_1 | 30.70% | 0 | 6233.2 | 56.10% | 99.00% | 5.30% |
| YL118_1 | 29.00% | 0 | 6002.6 | 56.00% | 99.00% | 5.30% |
| YL120_1 | 29.10% | 0 | 6059.3 | 55.40% | 98.90% | 5.50% |
| YL126_1 | 27.10% | 0 | 5390.5 | 55.30% | 98.90% | 5.60% |
| YL127_1 | 23.40% | 0 | 5620.5 | 55.50% | 99.10% | 5.00% |
| YL129_1 | 23.10% | 0 | 5686.6 | 55.40% | 98.90% | 5.60% |
| YL130_1 | 23.50% | 0 | 6220.2 | 55.60% | 98.80% | 5.00% |
| YL131_1 | 27.10% | 0 | 6568.5 | 55.30% | 98.90% | 4.50% |
| YL133_1 | 27.10% | 0 | 6873.0 | 55.80% | 98.90% | 4.80% |
| YL136_1 | 26.00% | 0 | 5540.2 | 55.60% | 98.90% | 5.20% |
| YL137_1 | 26.10% | 0 | 5562.9 | 55.10% | 98.80% | 5.00% |
| YL138_1 | 24.80% | 0 | 5024.7 | 55.30% | 98.90% | 4.90% |
| YL141_1 | 27.80% | 0 | 6007.7 | 55.30% | 98.80% | 4.20% |
| YL144_1 | 28.40% | 0 | 6118.8 | 55.50% | 98.70% | 4.60% |
| YL145_1 | 31.00% | 0 | 6178.3 | 55.00% | 99.00% | 4.80% |
Kallisto pseudoalignment statistics
| Sample | Reads processed | Reads pseudoaligned | Pseudoaligned (%) |
|---|---|---|---|
| YL100 | 29098807 | 21094024 | 72.5 |
| YL101 | 20892422 | 15355144 | 73.5 |
| YL103 | 19512334 | 14130936 | 72.4 |
| YL104 | 23926036 | 17299381 | 72.3 |
| YL105 | 20039046 | 14655877 | 73.1 |
| YL107 | 26950183 | 19621777 | 72.8 |
| YL117 | 22368139 | 16189942 | 72.4 |
| YL118 | 21562836 | 15955804 | 74.0 |
| YL120 | 21751789 | 15736861 | 72.3 |
| YL126 | 19332424 | 15298742 | 79.1 |
| YL127 | 20080398 | 15916542 | 79.3 |
| YL129 | 20431939 | 15449441 | 75.6 |
| YL130 | 22380678 | 16631553 | 74.3 |
| YL131 | 23564205 | 17761630 | 75.4 |
| YL133 | 24665464 | 18886945 | 76.6 |
| YL136 | 19903325 | 15478032 | 77.8 |
| YL137 | 19981361 | 15554246 | 77.8 |
| YL138 | 18020898 | 13928150 | 77.3 |
| YL141 | 21521086 | 16815205 | 78.1 |
| YL144 | 22045066 | 17183564 | 77.9 |
| YL145 | 22129793 | 17188124 | 77.7 |
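The percentage column is simply pseudoaligned reads divided by processed reads. A minimal Python sketch that reproduces it for three rows of the table above (the function name `pseudoaligned_pct` is my own and is not part of kallisto):

```python
# (processed, pseudoaligned) read counts copied from three rows of the table above
counts = {
    "YL100": (29098807, 21094024),
    "YL104": (23926036, 17299381),
    "YL127": (20080398, 15916542),
}

def pseudoaligned_pct(processed, pseudoaligned):
    """Percentage of processed reads that kallisto pseudoaligned, to 1 d.p."""
    return round(100 * pseudoaligned / processed, 1)

for sample, (processed, pseudoaligned) in counts.items():
    print(sample, pseudoaligned_pct(processed, pseudoaligned))
```

This is handy as a sanity check if the table is ever regenerated from the raw kallisto logs.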
Dispersion plot estimates and fitting from DESeq2 normalisation
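As part of its workflow, DESeq2 estimates a dispersion for each gene, i.e. how much its counts vary between replicates beyond what the mean alone would predict. It then fits a trend of dispersion against mean expression and shrinks the gene-wise estimates towards that trend. Genes whose dispersion lies far above the trend are flagged as outliers and keep their own estimate rather than being shrunk.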
Figure 2A. Dispersion estimates DESeq2
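To make "dispersion" concrete: under the negative binomial model that DESeq2 uses, variance = mean + dispersion × mean². The sketch below is a crude method-of-moments estimate on made-up counts. It is not DESeq2's actual estimator (which is likelihood-based and adds shrinkage), and the two example genes and their values are hypothetical.

```python
from statistics import mean, variance

def moments_dispersion(counts):
    """Rough dispersion: alpha = (variance - mean) / mean**2, floored at 0."""
    m = mean(counts)
    if m == 0:
        return 0.0
    v = variance(counts)  # sample variance (n - 1 denominator)
    return max((v - m) / m**2, 0.0)

# Hypothetical normalised counts, three replicates per gene
tight_gene = [100, 105, 95]   # replicates agree closely
noisy_gene = [100, 300, 20]   # replicates disagree strongly

print(moments_dispersion(tight_gene))
print(moments_dispersion(noisy_gene))
```

The tightly replicated gene comes out at zero (no variance beyond Poisson noise), while the noisy gene gets a dispersion of about 1, which is the kind of point that would sit well above the fitted trend in the dispersion plot.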
Normalised counts of TaRKD1-7A, 7B, 7D and synTaRKD1-7D
Normalised counts of natural and synthetic TaRKD1s in our samples. There is good expression of synTaRKD1-7D, with a little TaRKD1-7B in one of the samples, but this is likely mismapping. Expression is higher in meristem tissue than in immature embryo tissue.
Figure 3A. Normalised counts for TaRKD1-7A plotted with ggplotCounts
Figure 3B. Normalised counts for TaRKD1-7B plotted with ggplotCounts
Figure 3C. Normalised counts for TaRKD1-7D plotted with ggplotCounts
Figure 3D. Normalised counts for synTaRKD2 plotted with ggplotCounts
PCA analysis of normalised counts
PCA shows good separation between tissue types along PC1 (95% variance), while the meristem samples are separated along PC2 (4% variance) according to the presence of DMSO + Estradiol in treatments. There appears to be greater separation between meristem tissue samples than between immature embryo tissue samples.
Figure 4A. PCA of all samples
Similar to the Leaf TaRKD2 RNAseq, TE and TM have distinct expression profiles relative to the wild type, which might be due to dosage-specific effects on the transcriptional network. For these samples I will use the overlapping (shared) differentially expressed genes between (TE vs WE) and (TM vs WE).
Figure 4B. PCA of Immature embryo samples
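The shared-DEG selection described above is a set intersection. A minimal Python sketch, assuming each comparison has already been reduced to a set of significant gene IDs (the gene names and the helper `venn_regions` are hypothetical):

```python
def venn_regions(a, b):
    """Split two DEG sets into the three regions of a two-way Venn diagram."""
    return {"only_a": a - b, "shared": a & b, "only_b": b - a}

# Hypothetical DEG sets for the two immature embryo comparisons
te_vs_we = {"gene1", "gene2", "gene3", "gene4"}
tm_vs_we = {"gene3", "gene4", "gene5"}

regions = venn_regions(te_vs_we, tm_vs_we)
shared_degs = sorted(regions["shared"])
print(shared_degs)  # ['gene3', 'gene4']
```

Keeping the "only" regions around as well makes it easy to check the counts against a Venn diagram of the same comparisons.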
Meristem tissue doesn’t look great from what I can tell. What this PCA suggests is that the main separation between these samples is between those with DMSO in the treatment and those without (94% variance). Maybe the majority of the tissue we extracted was too differentiated? Laser microdissection or single cell RNAseq would probably be the only way to resolve this.
Figure 4C. PCA of Meristem samples
Venn diagram of differential comparisons
Pairwise differential comparisons were carried out on normalised counts at an adjusted p value (Benjamini-Hochberg) of 0.01 with an lfcThreshold of 0 (lfcThreshold=1 resulted in very few DEGs). There are 1030 DEGs shared between the TE and TM comparisons to WN, with good results downstream for functional analysis, and there are 418 dosage-related DEGs for the TE vs TM comparison.
Figure 5A. Venn diagram of IE comparisons
The meristem samples show a higher number of DEGs for the TN vs WN comparison and some dosage-dependent effect for the TE vs TM comparison.
Figure 5B. Venn diagram of Meristem comparisons
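For reference, the Benjamini-Hochberg adjustment behind the 0.01 cutoff can be sketched in a few lines of Python. This is a generic illustration with made-up p values, not DESeq2's own code (DESeq2 also applies independent filtering before adjusting, which this sketch ignores); the function name `bh_adjust` is my own.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p values for five genes
raw = [0.001, 0.008, 0.039, 0.041, 0.042]
padj = bh_adjust(raw)
significant = [p < 0.01 for p in padj]
print(padj)
print(significant)
```

With these values only the first gene survives the 0.01 cutoff, even though two raw p values are below 0.01, which is the whole point of the correction.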
Enhanced Volcano plot of differential comparisons
A volcano plot needs a single comparison, so it can’t be drawn for shared DEGs; I’ve used immature embryo TE vs WE here. We can see there is more upregulation than downregulation; however, the log fold changes aren’t massive.
Figure 6A. Enhanced volcano plot IE TE vs WE
There are equal amounts of upregulation and downregulation in the meristem comparison, with high log fold change values for TN vs WN. There are some decent functional terms in this comparison from what I’ve seen.
Figure 6B. Enhanced volcano plot Meristem TN vs WN
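The up/down balance read off the volcano plots can also be tallied directly. A minimal Python sketch on hypothetical results rows (gene, log2 fold change, adjusted p value). Note that this applies the fold-change cutoff as a simple post-hoc filter, which is an assumption of the sketch and not the same thing as DESeq2's lfcThreshold test.

```python
from collections import Counter

def classify(log2fc, padj, alpha=0.01, lfc_cutoff=0.0):
    """Label a gene 'up', 'down' or 'ns' (not significant)."""
    if padj is None or padj >= alpha:  # None stands in for an NA adjusted p value
        return "ns"
    if log2fc > lfc_cutoff:
        return "up"
    if log2fc < -lfc_cutoff:
        return "down"
    return "ns"

# Hypothetical rows: (gene, log2 fold change, adjusted p value)
results = [
    ("gene1", 1.8, 0.0004),
    ("gene2", -0.9, 0.003),
    ("gene3", 0.4, 0.2),
    ("gene4", 2.5, None),
    ("gene5", 0.6, 0.009),
]

tally = Counter(classify(lfc, padj) for _, lfc, padj in results)
print(tally["up"], tally["down"], tally["ns"])  # 2 1 2
```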
Heatmaps
Immature embryo shared DEGs show good heatmap clustering, with a nice dosage-specific trend between those in the TM and those in the TE category for both upregulated and downregulated WN genes. This will be a nice comparison to characterise the phenotype.
Figure 7A. Heatmap IE (TE vs NE DEGs)
We can clearly see here that the TN vs WN comparison is not reliable by itself, as it shows many differences to TM and TE. However, if we intersect all of the differential comparisons to WN, we can see a subset of DEGs that might explain our phenotype in vegetative tissue.
Figure 7B. Heatmap Meristem (TN vs WN DEGs)
GO term analysis of Upregulated DEGs
Gene ontology term analysis was carried out on upregulated DEGs for pairwise comparisons using gProfiler at a significance threshold of 0.05 with gProfiler’s custom FDR. I’ve annotated terms that I thought were relevant with their significance statistics below. Shared upregulated immature embryo DEGs show a nice enrichment of terms involved in transcription factor binding, chromatin packaging and accessibility, which is consistent with what we’ve observed with histone H4.
Figure 8A. gProfiler GO term analysis IE Shared DEGs
GO term analysis of Meristem tissue showed statistical enrichment of a few terms that could be construed to be involved in cell division such as cell wall related terms but this might be a stretch.
Figure 8B. gProfiler GO term analysis Meristem Shared DEGs
TE vs TM in both tissues just shows enrichment of transcription factor and ATPase processes which is consistent with dosage dependent changes of transcription factor expression.
Figure 8C. gProfiler GO term analysis IE TE vs TM DEGs
Figure 8D. gProfiler GO term analysis Meristem TE vs TM DEGs
Enrichment of Transcription Factor expression
Here I’ve used a program called gage to investigate whether there are significant differences in the expression of transcription factor families in indTaRKD1-7D immature embryos vs WT immature embryos, and in indTaRKD1-7D meristems vs WT meristems.
For the immature embryo enrichment we see a high enrichment in expression of NAC TFs which agrees nicely with our phenotype as well as AP2 TFs which should lead us into a nice self citation with OsRKD3. Downregulation of ARF could agree with the fact that most meristematic processes are cytokinin related?
Figure 9A. GAGE Transcription factor expression enrichment IE indTaRKD1-7D vs WT
For meristem tissue we can see clear enrichment of AP2 once again as well as WRKY and tify (Jasmonate regulators). I prefer the immature embryo TF enrichment though as NAC really stands out.
Figure 9B. GAGE Transcription factor expression enrichment Meristem indTaRKD1-7D vs WT
Normalised counts of interesting genes - Immature embryos
All plots were made using a function from the IDEAL package called ggplotCounts. I selected DEGs by overlapping them with various datasets, inputting these into BioMart and choosing the ones that looked most relevant to tillering/spike development.
Immature embryos DEGs that overlap with Microspore dataset
A MYB-related transcription factor that has a methyltransferase downstream, and a sucrose synthase.
Figure 10A. Normalised counts MYB
Figure 10B. Normalised counts Sucrose
Immature embryos DEGs that overlap with NIL High tillering dataset
Two histone H4 homeologues and UCL8, which has been reported in the literature to confer higher grain yield in rice.
Figure 10C. Normalised counts Histone H4 B
Figure 10D. Normalised counts Histone H4 D
Figure 10E. Normalised counts UCL8
Immature embryos TF DEGs
NAC NLT5 (flowering time suppressor), OsCOL4 (flowering time suppressor), and MADS51.
Figure 10F. Normalised counts NLT5
Figure 10G. Normalised counts OsCOL4
Figure 10H. Normalised counts MADS51